Skip to content

Support security group names in VPC Resource controller #389

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 26 commits into from

Conversation

GnatorX
Copy link
Contributor

@GnatorX GnatorX commented Mar 27, 2024

Issue #, if available: #20

Description of changes:
Add support for Security group names

Credit: Most of the code was implemented by @abeer-stripe

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@GnatorX GnatorX marked this pull request as ready for review March 28, 2024 21:04
@GnatorX GnatorX requested a review from a team as a code owner March 28, 2024 21:04
@GnatorX
Copy link
Contributor Author

GnatorX commented Mar 28, 2024

I have tested this, with some augmentation, on our cluster and it seems to work. We would need EKS to test this because we had to remove CreateNetworkInterfacePermission from the code (not something we can / need to do).

@GnatorX GnatorX changed the title DRAFT: Support security group names in VPC Resource controller Support security group names in VPC Resource controller Mar 28, 2024
@sushrk sushrk self-assigned this Apr 1, 2024
@mattbrandman
Copy link

This would also be super useful for tying together security group policies generated by TF with a naming schema. Right now we use a cli to generate the sg pull the id out and then set it as a tf var for import and a variable for argocd. This would be a huge improvement in that workflow

@sushrk
Copy link
Contributor

sushrk commented Apr 11, 2024

Thanks for the PR, I'm going to take a look this week.

// GroupNames contains the list of security group names that will be applied to the network interface of the pod matching the criteria.
type GroupNames struct {
// Groups is the list of EC2 Security Group Names that need to be applied to the ENI of a Pod.
// +kubebuilder:validation:MinItems=1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have MinItems=0.
MinItems=1 would enforce that customers specify at least 1 SG name. What if they want to specify only SG IDs?

We need to enforce that both lists combined is not empty though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya thats a good call out, let me see what we can do here

Copy link
Contributor Author

@GnatorX GnatorX Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The non-empty check I believe is checked on CreateNetworkInterface. Do you have different place in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it would be good to include in CRD validation, let me see how to do this with kubebuilder.

}

// For unit testing
func newEC2APIHelper(ec2Wrapper EC2Wrapper, clusterName, workerNodeVpcId string,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not required, you can use getMockWrapper in helper_test.go here and pass in any mock VPC ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We added this here because need to to expose the internal cache for testing.

}
sgNamesNotInCache := make([]string, 0)
for _, sgName := range securityGroupNames {
if sgId, ok := h.securityGroupNameToIdMap[sgName]; !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure caching is a good idea here.

Consider the scenario where a security group policy includes sg-example-name corresponding to sg-1111. This is added to securityGroupNameToIdMap, and some pod A is launched with sg-1111 attached to the branchENI. Now, customer deleted sg-1111 and created sg-2222 with the same name sg-example-name. But the map still points to old sg-1111 which no longer exists. So any pods launched after would be stuck in creation process as branchENI cannot be created(due to missing SG).

Need to evaluate if we always have to do an API call during pod creation or we can consider periodically refreshing the cache (TTL caching).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a TTL caching make sense given (at least from our side) how rare we change delete security group and recreate with the same name. The cache was added to reduce the number of calls we hit EC2 API with.

}

// GetSourceAcctAndArn constructs source acct and arn and return them for use
func GetSourceAcctAndArn(roleARN, region, clusterName string) (string, string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why this is moved from pkg/utils/helper.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a circular dependency which required moving a few things out of utils such as this and the httpclient code

garvinp-stripe and others added 3 commits April 15, 2024 14:01
PR changes: Move to getsgforVPC, move SG names into SG, unique items …
@GnatorX
Copy link
Contributor Author

GnatorX commented Apr 15, 2024

Updated the PR. Let me know what you think about TTL cache. If you want to move in that direction. Do you have a recommendation of TTL cache, if not I will just pick up https://github.com/kubernetes/client-go/blob/v0.29.3/tools/cache/expiration_cache.go#L38

@GnatorX GnatorX requested a review from sushrk April 16, 2024 17:02
@sushrk
Copy link
Contributor

sushrk commented Apr 16, 2024

Updated the PR. Let me know what you think about TTL cache. If you want to move in that direction. Do you have a recommendation of TTL cache, if not I will just pick up https://github.com/kubernetes/client-go/blob/v0.29.3/tools/cache/expiration_cache.go#L38

Thanks, will go through this.
Can you also rebase and squash the commits? Would make it easier to review and merge the PR.

@GnatorX
Copy link
Contributor Author

GnatorX commented Apr 17, 2024

I am going to close this against #408 because it was simpler to rebase + squash

@GnatorX GnatorX closed this Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants